\subsection{Windows NT: Critical section}
\myindex{Windows}

\label{critical_sections}


Critical sections in any \ac{OS} are very important in multithreaded environment,
mostly for giving a guarantee
that only one thread can access some data in a single moment of time, 
while blocking other threads and interrupts.

\par
That is how a \TT{CRITICAL\_SECTION} 
structure is declared in \gls{Windows NT} line OS:

\begin{lstlisting}[caption=(Windows Research Kernel v1.2) public/sdk/inc/nturtl.h,style=customc]
typedef struct _RTL_CRITICAL_SECTION {
    PRTL_CRITICAL_SECTION_DEBUG DebugInfo;

    //
    //  The following three fields control entering and exiting the critical
    //  section for the resource
    //

    LONG LockCount;
    LONG RecursionCount;
    HANDLE OwningThread;        // from the thread's ClientId->UniqueThread
    HANDLE LockSemaphore;
    ULONG_PTR SpinCount;        // force size on 64-bit systems when packed
} RTL_CRITICAL_SECTION, *PRTL_CRITICAL_SECTION;
\end{lstlisting}

That's is how EnterCriticalSection() function works:

\myindex{x86!\Instructions!LOCK}
\begin{lstlisting}[caption=Windows 2008/ntdll.dll/x86 (begin),style=customasmx86]
_RtlEnterCriticalSection@4

var_C           = dword ptr -0Ch
var_8           = dword ptr -8
var_4           = dword ptr -4
arg_0           = dword ptr  8

                mov     edi, edi
                push    ebp
                mov     ebp, esp
                sub     esp, 0Ch
                push    esi
                push    edi
                mov     edi, [ebp+arg_0]
                lea     esi, [edi+4] ; LockCount
                mov     eax, esi
                lock btr dword ptr [eax], 0
                jnb     wait ; jump if CF=0

loc_7DE922DD:
                mov     eax, large fs:18h
                mov     ecx, [eax+24h]
                mov     [edi+0Ch], ecx
                mov     dword ptr [edi+8], 1
                pop     edi
                xor     eax, eax
                pop     esi
                mov     esp, ebp
                pop     ebp
                retn    4

... skipped
\end{lstlisting}

\myindex{x86!\Instructions!BTR}
\myindex{x86!\Prefixes!LOCK}

The most important instruction in this code fragment is \TT{BTR} 
(prefixed with \TT{LOCK}): 

the zeroth bit is stored in the CF flag and cleared in memory.
This is an \gls{atomic operation}, 

blocking all other CPUs' access to this piece of memory 
(see the \TT{LOCK} prefix before the \TT{BTR} instruction).
If the bit at \TT{LockCount} is 1, 

fine, reset it and return from the function: we are in a critical section.

If not---the critical section is already occupied by other thread, so wait. \\
The wait is performed there using WaitForSingleObject(). \\
\\
And here is how the LeaveCriticalSection() function works:

\begin{lstlisting}[caption=Windows 2008/ntdll.dll/x86 (begin),style=customasmx86]
_RtlLeaveCriticalSection@4 proc near

arg_0           = dword ptr  8

                mov     edi, edi
                push    ebp
                mov     ebp, esp
                push    esi
                mov     esi, [ebp+arg_0]
                add     dword ptr [esi+8], 0FFFFFFFFh ; RecursionCount
                jnz     short loc_7DE922B2
                push    ebx
                push    edi
                lea     edi, [esi+4]    ; LockCount
                mov     dword ptr [esi+0Ch], 0
                mov     ebx, 1
                mov     eax, edi
                lock xadd [eax], ebx
                inc     ebx
                cmp     ebx, 0FFFFFFFFh
                jnz     loc_7DEA8EB7

loc_7DE922B0:
                pop     edi
                pop     ebx

loc_7DE922B2:
                xor     eax, eax
                pop     esi
                pop     ebp
                retn    4

... skipped
\end{lstlisting}

\myindex{x86!\Instructions!XADD}
\TT{XADD} is \q{exchange and add}.

In this case, it adds 1 to \TT{LockCount}, meanwhile saves initial value of \TT{LockCount} in the \TT{EBX} register.
However, value in \TT{EBX} is to incremented with a help of subsequent \INS{INC EBX}, and it also will be equal to
the updated value of \TT{LockCount}.

This operation is atomic since it is prefixed by \TT{LOCK} as well,
meaning that all other CPUs or CPU cores in system are blocked from accessing this point in memory.

The \TT{LOCK} prefix is very important: 

without it two threads, each of which works on separate CPU or CPU core can try to
enter a critical section and to modify the value in memory,
which will result in non-deterministic behavior.

% TODO linux

